A Diversity-Enhanced and Constraints-Relaxed Augmentation for Low-Resource Classification
Abstract
Previous studies on Data Augmentation (DA) mostly use a fine-tuned Language Model (LM) to strengthen the constraints but ignore the fact that the potential of diversity could improve the effectiveness of the generated data. To address this dilemma, we propose a Diversity-Enhanced and Constraints-Relaxed Augmentation (DECRA). DECRA has two essential components on top of a transformer-based backbone model: a \(k\)-\(\beta\) augmentation and a masked language model loss. Extensive experiments demonstrate that our DECRA outperforms state-of-the-art approaches by 3.8% in the overall score.
Similar resources
Data augmentation for low resource languages
Recently there has been interest in the approaches for training speech recognition systems for languages with limited resources. Under the IARPA Babel program such resources have been provided for a range of languages to support this research area. This paper examines a particular form of approach, data augmentation, that can be applied to these situations. Data augmentation schemes aim to incr...
The innovation of a statistical model to estimate dependable rainfall (DR) and develop it for determination and classification of drought and wet years of Iran
Water from precipitation supplies the countless needs of living beings, especially humans, and any decrease in its quantity or quality directly and negatively affects the life of living organisms. Year-to-year fluctuation is one of the fundamental and most important characteristics of Iran's annual precipitation, and its harmful effects are reflected in some way in every economic, social, and even political-security arena. Because the amount of water from precipitation is one of the main components of planning ...
Data Augmentation for Low-Resource Neural Machine Translation
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, syn...
Training Data Augmentation for Low-Resource Morphological Inflection
This work describes the UoE-LMU submission for the CoNLL-SIGMORPHON 2017 Shared Task on Universal Morphological Reinflection, Subtask 1: given a lemma and target morphological tags, generate the target inflected form. We evaluate several ways to improve performance in the 1000-example setting: three methods to augment the training data with identical input-output pairs (i.e., autoencoding), a h...
Vehicle Classification on Low-resolution and Occluded images: A low-cost labeled dataset for augmentation
Video image processing of traffic camera feeds is useful for counting and classifying vehicles, estimating queue length, traffic speed and also for tracking individual vehicles. Even after over three decades of research, challenges remain. Vehicle detection is especially challenging when vehicles are occluded which is common in heterogeneous traffic. Recently Deep Learning has shown rem...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-73197-7_17